Categorical FDA under prospective sampling scheme: a proposal for variable selection
Abstract
Given a population described by p explanatory variables and one categorical dependent variable, we assume that the dependent variable partitions the population into g groups. Discriminant analysis studies the relation between the p explanatory variables and the dependent variable, seeking the subset of variables with the most predictive power. In categorical discriminant analysis, the a priori probabilities associated with the g groups are generally assumed to be known. In this paper we summarise some suitable approaches under the hypothesis of unknown a priori group probabilities, and we propose a new variable selection algorithm.
Similar resources
Bayesian Versus Maximum Quasi-Likelihood Methods for Sire Evaluation with Categorical Data
Binary variables arising from an underlying normal distribution with a fixed threshold were simulated in a two-stage selection scheme with a sire model. The model had herd-year-seasons and groups as fixed effects and sires as random variables; a heritability of .25 was used in the simulation. The "best" 20% of the sires were allowed to have additional progeny in the second stage. The criteria u...
CATEGORICAL DATA ANALYSIS, 3rd edition: Extra Exercises
This file contains extra exercises. Most of these were in the first or second edition of the text but did not fit in the 3rd edition. They are organized by chapter. Instructors are welcome to use them for homework or exams. 1. In the following examples, identify the response variable and the explanatory variables. 2. Which measurement scale is most appropriate for attitude toward legalization of ...
Amazon Employee Access Control System
In this work, based on historical data from 2010-2011 from Amazon Inc., we build a system that aims to take the place of resource administrators at Amazon. Our analysis shows that the given dataset is highly imbalanced and consists of categorical values. Thus, in the preprocessing step, we tried different sampling methods, feature selection, and one-hot encoding to make the data more suitable for predi...
Dimension Reduction and Variable Selection in Case Control Studies via Regularized Likelihood Optimization
Dimension reduction and variable selection are performed routinely in case-control studies, but the literature on the theoretical aspects of the resulting estimates is scarce. We bring our contribution to this literature by studying estimators obtained via l1 penalized likelihood optimization. We show that the optimizers of the l1 penalized retrospective likelihood coincide with the optimizers ...
Pearson and Log-likelihood Chi-square Test of Fit for Latent Class Analysis Estimated with Complex Samples
In this note we discuss model fit evaluation for the Latent Class Analysis (LCA) model under complex sampling. Suppose that for the i-th individual in the sample we observe r categorical/discrete variables Ui1, ..., Uir. Suppose that for individual i there exists one unobserved categorical variable Ci, called the latent class variable. The LCA model is described by the following equations P (Uij =...
Publication date: 2001